This is in response to the amendment filed July 26, 2004.

DETAILED ACTION 
Response to Arguments 

Applicant's arguments filed July 26, 2004 have been fully considered but they are not
persuasive.

Applicant argues on pages 10-12 that "neither Meyerzon nor Nelson, singularly or in
combination, teach or suggest extracting a portion of the document that characterizes the
document's subject content to form the document extract." Examiner respectfully disagrees.
Nelson discloses a processing system that separates or decomposes the "multimedia document"
(110, Fig.2, Nelson) into a "list of multimedia components" of different data types (120, Fig.2;
Fig.4 and col.5, lines 52-55) and converts a single block of component data into a list of tokens.
These tokens are stored in the multimedia index and then presented to the user as a "search result"
including "document title", "document summary" and other useful forms (col.5, line 52-col.6, line 65;
col.7, lines 46-67 and col.9, lines 60-65, Nelson). The "extracting a portion of the document that
characterizes the document's subject content to form the document extract" must be performed
in steps 110 and 120, Fig.2 of Nelson.

Claim Rejections - 35 USC §103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 
1. Claims 1-6, 8-14, 16-20 and 22-23 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Meyerzon et al. (US Patent no. 6,631,369) in view of Nelson et al. (US Patent 
no. 6,243,713). 

Regarding claims 1 and 9, Meyerzon discloses a method for retrieving information using 
a search engine comprising the steps of: 

(a) retrieving a document to be indexed (see col.4, lines 43-54, Meyerzon); 

(b) generating a document extract corresponding to the document (see col. 4, lines 53-67, 
Meyerzon); and 

(d) storing the plurality of tokens in a search index, wherein the search engine accesses
the search index to retrieve information in one or more document extracts satisfying a search
query (see col. 7, lines 44-65 and col.8, lines 1-10, Meyerzon; the data type of information
corresponds to the "token").

Meyerzon, however, does not explicitly disclose extracting a portion of the document that
characterizes the document's subject content to form the document extract and decomposing the
document extract into a plurality of tokens. Nelson, on the other hand, discloses a retrieval
system for retrieval of multimedia information including extracting a portion of the document
and decomposing the document into a plurality of tokens (see abstract of Nelson; col. 5, line 52-
col.6, line 65; col.7, lines 46-67 and col.9, lines 60-65). It would have been obvious to one of
ordinary skill in the art at the time of the invention to modify Meyerzon to include the claimed
feature as taught by Nelson. The motivation for doing so would have been to improve the
efficiency of incremental crawls that are used to manage document stores (see col. 3, lines 65-67,
Meyerzon).
Regarding claim 17, Meyerzon discloses a system for retrieving information, wherein the
system includes a search engine comprising: 

- means for retrieving a document from a document repository (see col.4, lines 43-54 
and element 200, Fig.2 and corresponding text, Meyerzon); 

- an information extractor coupled to the means for retrieving, wherein the information
extractor generates a document extract corresponding to the document (see col. 4,
lines 53-67, Meyerzon). Each document is retrieved from the web site process and the
data is extracted from each of these retrieved documents. Therefore, there must be an
extractor for the extracting process;

- a storage device (100, Fig.2 and corresponding text, Meyerzon) coupled to the 
information extractor for storing the document extract; 

- a search engine indexer (300, Fig.2) coupled to the storage device; and 

- a search index (400, Fig.2) coupled to the search engine indexer for storing the 
plurality of tokens, wherein the search engine accesses the search index to retrieve 
information in one or more document extracts satisfying a search query (see col. 7, 
lines 44-65 and col. 8, lines 1-10; Fig.2 and corresponding text, Meyerzon). 

Meyerzon, however, does not explicitly disclose the steps of extracting a portion of the
document that characterizes the document's subject content to form the document extract and
decomposing the document extract into a plurality of tokens. Nelson, on the other hand, discloses
a retrieval system for retrieval of multimedia information including decomposing the
document into a plurality of tokens (see abstract of Nelson; col.5, line 52-col.6, line 65; col.7,
lines 46-67 and col.9, lines 60-65). It would have been obvious to one of ordinary skill in the art
at the time of the invention to modify Meyerzon to include the claimed feature as taught by
Nelson. The motivation for doing so would have been to improve the efficiency of incremental
crawls that are used to manage document stores (see col. 3, lines 65-67, Meyerzon).

Application/Control Number: 09/989,970
Art Unit: 2161

Regarding claims 2, 10 and 18, Meyerzon/Nelson combination further discloses the steps
of (b1) extracting a portion of the document that characterizes the document's subject content to
form the document extract; and (b2) recording positional information of the portion extracted
within the document (see col. 6, lines 1-10, Nelson).

Regarding claims 3 and 11, Meyerzon/Nelson combination further discloses the step of 
storing the document extract in a storage device (see Fig.2 and corresponding text, Meyerzon). 

Regarding claims 4, 12 and 19, Meyerzon/Nelson combination further discloses the step 
of storing the recorded positional information with the plurality of tokens (see col.6, lines 1-34, 
Nelson). 

Regarding claims 5 and 13, Meyerzon/Nelson combination further discloses the step of
extracting from the document a collection of sentences that are characteristic of the
document's subject content to form a document summary
(see abstract, col.5, line 52-col.6, line 65; col.7, lines 46-67 and col.9, lines 60-65, Nelson).

Regarding claims 6, 14 and 20, Meyerzon/Nelson combination discloses the step of
selecting from the document extract one of a whole sentence, a portion of a sentence, a word, and
a feature (see col.6, lines 16-34; col.7, lines 46-67 and col.9, lines 60-65, Nelson).

Regarding claims 8, 16 and 22, Meyerzon/Nelson combination further discloses that the 
document is a web-page in the Internet (see Fig.2, Meyerzon). 
Regarding claim 23, Meyerzon/Nelson combination further discloses the means for
retrieving the document is a web crawler (see abstract of Meyerzon). 

2. Claims 7, 15 and 21 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Meyerzon et al. (US Patent no. 6,631,369) in view of Nelson et al. (US Patent no. 6,243,713), 
and further in view of Smadja (US 6,621,930). 

Regarding claims 7, 15 and 21, Meyerzon/Nelson combination discloses all of the
claimed limitations as discussed above except the step of selecting tokens based on frequency of
occurrence, word-salient-measure, proximity to the beginning of a paragraph, proximity to the
beginning of the document, and proximity to or position within a heading or a caption. Smadja,
on the other hand, discloses an electronic device that automatically classifies documents based upon
textual content including the frequency of occurrence of the token in the selected document
(col.3, lines 8-34; col.4, lines 46-51 and 57-65; col. 13, lines 16-19 and col.14, lines 5-9,
Smadja). It would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify the combination system of Meyerzon and Nelson to include the
claimed feature as taught by Smadja. The motivation for doing so would have been to provide
more accurate search results based on the index.

Conclusion 

3. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. 

Messerly et al. (US 6,076,051) disclose information retrieval utilizing semantic
representation of text.
4. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Hanh B Thai whose telephone number is 571-272-4029. The 
examiner can normally be reached on 8 AM - 4:30 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Safet Metjahic, can be reached on 571-272-4023. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 

Hanh B Thai 
Examiner 
Art Unit 2161 

January 4, 2005 




SAFET METJAHIC
SUPERVISORY PATENT EXAMINER
TECHNOLOGY CENTER 2100



